Visit-With-Us¶

Business Context¶

"Visit with Us," a leading travel company, is revolutionizing the tourism industry by leveraging data-driven strategies to optimize operations and customer engagement. While introducing a new package offering, such as the Wellness Tourism Package, the company faces challenges in targeting the right customers efficiently. The manual approach to identifying potential customers is inconsistent, time-consuming, and prone to errors, leading to missed opportunities and suboptimal campaign performance.

To address these issues, the company aims to implement a scalable, automated system that integrates customer data, predicts potential buyers, and enhances decision-making for marketing strategies. By utilizing an MLOps pipeline, the company seeks seamless integration of data preprocessing, model development, deployment, and CI/CD practices for continuous improvement. This system will ensure efficient targeting of customers, timely updates to the predictive model, and adaptation to evolving customer behaviors, ultimately driving growth and customer satisfaction.

Objective¶

As an MLOps Engineer at "Visit with Us," your responsibility is to design and deploy an MLOps pipeline on GitHub to automate the end-to-end workflow for predicting customer purchases. The primary objective is to build a model that predicts whether a customer will purchase the newly introduced Wellness Tourism Package before contacting them. The pipeline will include data cleaning, preprocessing, transformation, model building, training, evaluation, and deployment, ensuring consistent performance and scalability. By leveraging GitHub Actions for CI/CD integration, the system will enable automated updates, streamline model deployment, and improve operational efficiency. This robust predictive solution will empower policymakers to make data-driven decisions, enhance marketing strategies, and effectively target potential customers, thereby driving customer acquisition and business growth.

Data Description¶

The dataset contains customer and interaction data that serve as key attributes for predicting the likelihood of purchasing the Wellness Tourism Package. The detailed attributes are:

Customer Details

  • CustomerID: Unique identifier for each customer.
  • ProdTaken: Target variable indicating whether the customer has purchased a package (0: No, 1: Yes).
  • Age: Age of the customer.
  • TypeofContact: The method by which the customer was contacted (Company Invited or Self Inquiry).
  • CityTier: The city category based on development, population, and living standards (Tier 1 > Tier 2 > Tier 3).
  • Occupation: Customer's occupation (e.g., Salaried, Freelancer).
  • Gender: Gender of the customer (Male, Female).
  • NumberOfPersonVisiting: Total number of people accompanying the customer on the trip.
  • PreferredPropertyStar: Preferred hotel rating by the customer.
  • MaritalStatus: Marital status of the customer (Single, Married, Divorced).
  • NumberOfTrips: Average number of trips the customer takes annually.
  • Passport: Whether the customer holds a valid passport (0: No, 1: Yes).
  • OwnCar: Whether the customer owns a car (0: No, 1: Yes).
  • NumberOfChildrenVisiting: Number of children below age 5 accompanying the customer.
  • Designation: Customer's designation in their current organization.
  • MonthlyIncome: Gross monthly income of the customer.

Customer Interaction Data

  • PitchSatisfactionScore: Score indicating the customer's satisfaction with the sales pitch.
  • ProductPitched: The type of product pitched to the customer.
  • NumberOfFollowups: Total number of follow-ups by the salesperson after the sales pitch.
  • DurationOfPitch: Duration of the sales pitch delivered to the customer.
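The attributes above can be sanity-checked quickly with pandas. A minimal sketch, using a tiny hand-made frame with a subset of the columns described (the real notebook loads Data/tourism.csv instead):

```python
import pandas as pd

# Illustrative frame with a few of the columns described above;
# values are made up, not taken from the actual dataset.
df = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "Age": [34, 41, 29, 52],
    "Passport": [1, 0, 1, 0],
    "ProdTaken": [1, 0, 0, 1],   # target: 1 = purchased a package
})

# The class balance of ProdTaken drives how the models are evaluated later.
purchase_rate = df["ProdTaken"].mean()
print(f"Purchase rate: {purchase_rate:.2f}")   # prints: Purchase rate: 0.50
```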

Project Folder Structure¶

|--> VisitWithUs-Tourism_version_1_1
    |--> Master
        |--> Data  # STORING DATASET FILES
            |--> tourism.csv
            |--> test.csv  
            |--> train.csv    
        |--> Model_Dump_JOBLIB  # STORING PROGRAM-GENERATED MODELS
            |--> best_threshold.txt
            |--> best_XGBoostingClassifier.joblib
            |--> XGBoostingClassifier.joblib
            |--> XGBoostingClassifier_ConfusionMatrix.png
            |--> RandomForestClassifier.joblib
            |--> RandomForestClassifier_ConfusionMatrix.png
            |--> GradientBoostingClassifier.joblib
            |--> GradientBoostingClassifier_ConfusionMatrix.png
            |--> DecisionTreeClassifier.joblib
            |--> DecisionTreeClassifier_ConfusionMatrix.png
        |--> Deployment # STORING STREAMLIT DEPLOYMENT FILE
          |--> app.py
          |--> requirement.txt
          |--> README.md
          |--> DockerFile
    |--> Visit-With-Us-Tourism-Prediction_v1_1.ipynb
    |--> DataRegistration.py
    |--> DataPrepration.py
    |--> BuildingModels.py
    |--> HostingInHuggingFace.py
    |--> main.py
    |--> .gitignore
    |--> .env
    |--> README.md
    |--> mlruns
        |--> models
        |--> 674721534787404130
        |--> .trash
|--> .github
    |--> workflows
        |--> pipeline.yml
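The Model_Dump_JOBLIB folder pairs each serialized model with a tuned decision threshold (best_threshold.txt). A hedged sketch of how such a dump might be written and read back, using a small scikit-learn tree in a temporary directory; the file names mirror the structure above, but the data and threshold value are illustrative, not the project's:

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Train a small stand-in model; the notebook itself tunes several classifiers.
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
model = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)

dump_dir = tempfile.mkdtemp()

# Persist the model and its tuned threshold, as the folder structure suggests.
joblib.dump(model, os.path.join(dump_dir, "DecisionTreeClassifier.joblib"))
with open(os.path.join(dump_dir, "best_threshold.txt"), "w") as fh:
    fh.write("0.45")   # illustrative value, not the project's tuned threshold

# Later (e.g. in Deployment/app.py) the pair is loaded and applied together.
loaded = joblib.load(os.path.join(dump_dir, "DecisionTreeClassifier.joblib"))
with open(os.path.join(dump_dir, "best_threshold.txt")) as fh:
    threshold = float(fh.read())

# Apply the saved threshold to the positive-class probability.
preds = (loaded.predict_proba(X)[:, 1] >= threshold).astype(int)
print(preds[:10])
```

Storing the threshold alongside the model matters because the default 0.5 cutoff is rarely optimal on an imbalanced target like ProdTaken.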

INSTALLING PACKAGES¶

  • huggingface_hub: to interact with Hugging Face programmatically (creating Spaces, datasets, and models, and deploying the Streamlit app)
  • python-dotenv: to load credentials stored in a .env file
  • datasets: to create datasets and load them into Hugging Face
  • pandas: data manipulation (DataFrames)
  • scikit-learn: to build ensemble models, perform the train/test split, and compute metrics
  • xgboost: to build the XGBoost classifier models
  • seaborn & matplotlib: to create visuals
  • joblib: to dump trained models to disk
  • streamlit: to build the front end
  • mlflow: to track experiments and model runs
  • pyngrok: to expose local servers (e.g., the MLflow UI) through ngrok tunnels
  • setuptools: packaging utilities required by some of the installs
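For the Streamlit deployment, the same dependencies would typically be listed in Deployment/requirement.txt. An illustrative (unpinned) sketch of what that file might contain, inferred from the packages above rather than copied from the actual file:

```text
huggingface_hub
python-dotenv
datasets
pandas
scikit-learn
xgboost
seaborn
matplotlib
joblib
streamlit
```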
In [1]:
!pip install huggingface_hub
!pip install python-dotenv
!pip install datasets
!pip install pandas
!pip install scikit-learn
!pip install xgboost
!pip install seaborn
!pip install matplotlib
!pip install joblib
!pip install streamlit
!pip install mlflow
!pip install pyngrok
!pip install setuptools

MOUNTING DRIVE¶

This block mounts Google Drive and reads the Hugging Face token.

In [5]:
import os
from google.colab import drive
drive.mount('/content/drive/')
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/'
base_path = os.getcwd()
print(f"Base Path {base_path}")
Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Base Path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
In [28]:
from google.colab import userdata
hf_token = userdata.get('HF_TOKEN')
In [29]:
!ls
BuildingModels.py	 main.py
Data			 mlruns
DataPrepration.py	 Model_Dump_JOBLIB
DataRegistration.py	 __pycache__
Deployment		 README.md
HostingInHuggingFace.py  Visit-With-Us-Tourism-Prediction_v1_1.ipynb

1. DATA REGISTRATION¶

    |--> class DataRegistration
        |--> def __init__(self, base_path, hf_token=None)
              * Constructor assigning the base path and the Hugging Face token
        |--> def HFCreateRepo(self)
              * Creates the dataset repository on Hugging Face
        |--> def UploadingSourceData(self)
              * Uploads the local tourism.csv file into the Hugging Face dataset
        |--> def ToRunPipeline(self)
              * Invokes the repo creation and the dataset upload on Hugging Face
In [30]:
#@title Data Registration Class
%%writefile DataRegistration.py
import os
import traceback
import inspect
from huggingface_hub import HfApi, create_repo

class DataRegistration:
  def __init__(self,base_path,hf_token=None):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    self.repoID = 'jpkarthikeyan/Tourism-visit-with-us-dataset'
    self.Subfolders = os.path.join(base_path,'Data')
    self.folder_Master = base_path
    self.folder_data = os.path.join(base_path,"Data")
    self.hf_token = hf_token

    os.makedirs(self.folder_data, exist_ok=True)
    print(f"self.Subfolders: {self.Subfolders}")
    print(f"self.folder_Master: {self.folder_Master}")
    print(f"folder_data: {self.folder_data}")
    print('-'*50)

  def HFCreateRepo(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      create_repo(repo_id=self.repoID,
                  private=False,
                  repo_type='dataset',
                  token=self.hf_token,  # token is required to create the repo
                  exist_ok=True)
      print(f"Repo {self.repoID} created")
      return True

    except Exception as ex:
      if hasattr(ex,'response') and ex.response.status_code == 409:
        print(f"Repo {self.repoID} already exists")
        return True
      else:
        print(f"Exception {ex}")
        traceback.print_exc()
        return False
    finally:
      print("-"*100)


  def UploadingSourceData(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      source_data_file = os.path.join(self.folder_data,'tourism.csv')
      print(f"Source Data File {source_data_file}")
      if not os.path.exists(source_data_file):
        raise FileNotFoundError(f"File {source_data_file} not found")
      api = HfApi()
      api.upload_file(
          path_or_fileobj = source_data_file,
          path_in_repo = 'Master/Data/tourism.csv',
          repo_id = self.repoID,
          repo_type='dataset',
          token=self.hf_token)
      print(f"Source data tourism.csv uploaded into {self.repoID}")
      return True

    except Exception as ex:
       print(f"Exception at {inspect.currentframe().f_code.co_name} Exception: {ex}")
       traceback.print_exc()
       return False
    finally:
      print("-"*100)

  def ToRunPipeline(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    if not self.HFCreateRepo():
      print('Exception in data registration HFCreateRepo')
      return False
    print('-'*50)
    if not self.UploadingSourceData():
      print('Exception in data registration UploadingSourceData')
      return False
    print('Data Registration Completed')
    print('-'*50)
    return True
Overwriting DataRegistration.py

2. DATA PREPARATION¶

      |--> class DataPrepration
            |--> def __init__(self, base_path, hf_token)
                * Constructor for initializing the base path and HF token
            |--> def LoadDatasetFromHF(self)
                * Loads the source dataset from Hugging Face into a DataFrame
            |--> def TrainTestSplit(self, df_dataset)
                * Splits the source dataset into train and test DataFrames
            |--> def DatasetCleaning(self,df_data)
                * Removes duplicates and fills missing/NaN values
            |--> def UploadIntoHF(self,df,drive_path,file_name)
                * Saves the train and test DataFrames as local CSV files
                * Uploads the saved CSV files into the Hugging Face dataset
            |--> def ToRunPipeline(self)
                * Invokes the above functions in sequence
                * Loads the source file from the Hugging Face dataset, splits it into train and test, cleans both, saves them locally, and uploads them back into the Hugging Face dataset
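The cleaning rules listed above (normalizing the 'Fe Male' label, dropping duplicate CustomerIDs, then median/mode imputation) can be sketched on a made-up frame; unlike DatasetCleaning, this sketch also imputes float columns with the median:

```python
import pandas as pd

# Toy frame mimicking the raw tourism data (values are illustrative only)
df = pd.DataFrame({
    'CustomerID': [1, 1, 2, 3],
    'Gender': ['Fe Male', 'Fe Male', 'Male', None],
    'Age': [30, 30, None, 50],
})

df['Gender'] = df['Gender'].replace('Fe Male', 'Female')  # normalize the mislabeled category
df = df.drop_duplicates(subset=['CustomerID'], keep='first').reset_index(drop=True)

for clmn in df.columns:
    if df[clmn].dtype in ['int64', 'float64']:
        df[clmn] = df[clmn].fillna(df[clmn].median())      # numeric -> median
    else:
        df[clmn] = df[clmn].fillna(df[clmn].mode()[0])     # categorical -> mode

print(df)
```

The result has one row per CustomerID, the gender label repaired, and no remaining missing values.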
In [31]:
#@title DataPrepration.py
%%writefile DataPrepration.py
import os
import pandas as pd
import inspect
import traceback
from datasets import load_dataset
from sklearn.model_selection import train_test_split
from huggingface_hub import HfApi, create_repo, login, hf_hub_download

class DataPrepration:
  def __init__(self,base_path, hf_token=None):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    self.repoID = 'jpkarthikeyan/Tourism-visit-with-us-dataset'
    self.Subfolders = os.path.join(base_path, 'Data')
    self.hf_token = hf_token
    print(f'self.repoID: {self.repoID}')
    print(f'self.Subfolders: {self.Subfolders}')
    print('-'*50)

  def LoadDatasetFromHF(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      df_dataset = pd.read_csv(hf_hub_download(
                                repo_id = self.repoID,
                                filename = 'Master/Data/tourism.csv',
                                repo_type='dataset'
                              ))

      print(f'Shape of the original dataset {df_dataset.shape}')

      if 'Unnamed: 0' in df_dataset.columns:
        df_dataset = df_dataset.drop(['Unnamed: 0'],axis=1)

      print(f"Dataset loaded from {self.repoID}/{self.Subfolders}")
      print(f"Shape of the Original Dataset: {df_dataset.shape}")
      return df_dataset
    except Exception as ex:
      print(f"Exception {ex}")
      traceback.print_exc()
      return None
    finally:
      print('-'*50)

  def TrainTestSplit(self,df_dataset):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      print(f"Value Count {df_dataset['ProdTaken'].value_counts()}")

      df_train,df_test = train_test_split(df_dataset,
                                          test_size=0.2,
                                          random_state=42,
                                          stratify=df_dataset['ProdTaken'],
                                          shuffle=True)

      print(f"Shape of the train dataset: {df_train.shape}")
      print(f"Shape of the test dataset: {df_test.shape}")

      return df_train, df_test
    except Exception as ex:
      print(f'Exception: {ex}')
      traceback.print_exc()
      return None, None
    finally:
      print('-'*50)

  def DatasetCleaning(self,df_data):
    try:
      print(f"Function Name {inspect.currentframe().f_code.co_name}")
      df_data['Gender'] = df_data['Gender'].replace('Fe Male', 'Female')

      df_data = df_data.drop_duplicates(subset=['CustomerID'], keep='first').reset_index(drop=True)

      for clmn in df_data.columns:
        if df_data[clmn].dtype in ['int64']:
          #print(f"{clmn} replacing the missing value with median")
          df_data[clmn] = df_data[clmn].fillna(df_data[clmn].median())
        else:
          #print(f"{clmn} replacing the missing value with mode")
          df_data[clmn] = df_data[clmn].fillna(df_data[clmn].mode()[0])

      df_data = df_data.drop(['CustomerID'], axis=1)

      numerical_column = df_data.select_dtypes(include=['int64'])

      for num_col in numerical_column:
        Q1 = df_data[num_col].quantile(0.25)
        Q3 = df_data[num_col].quantile(0.75)
        IQR = Q3 - Q1
        lower = Q1 - 1.5*IQR
        upper = Q3 + 1.5*IQR
        #df_data[num_col] = df_data[num_col].clip(lower,upper)

      return df_data

    except Exception as ex:
      print(f"Exception {ex}")
      traceback.print_exc()
      return None
    finally:
      print('-'*50)

  def UploadIntoHF(self,df,drive_path,file_name):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      file_path = os.path.join(drive_path,file_name)
      df.to_csv(file_path,index=False)

      api = HfApi(token = self.hf_token)
      api.upload_file(path_or_fileobj =file_path,
                      path_in_repo= f"Master/Data/{file_name}",
                      repo_id = self.repoID,
                      repo_type='dataset',
                      token=self.hf_token)
      print(f"Source data {file_name} uploaded into {self.repoID}")
      return True
    except Exception as ex:
      print(f"Exception: {ex}")
      traceback.print_exc()
      return False
    finally:
      print('-'*50)

  def ToRunPipeline(self):
    try:
      print(f"Function Name {inspect.currentframe().f_code.co_name}")
      df_dataset = self.LoadDatasetFromHF()
      if df_dataset is None:
        return False
      df_train, df_test = self.TrainTestSplit(df_dataset)
      if df_train is None or df_test is None:
        return False
      df_train_cleaned = self.DatasetCleaning(df_train)
      df_test_cleaned = self.DatasetCleaning(df_test)
      if df_train_cleaned is None or df_test_cleaned is None:
        return False
      result_train = self.UploadIntoHF(df_train_cleaned,
                                       self.Subfolders,'train.csv')
      result_test = self.UploadIntoHF(df_test_cleaned,
                                      self.Subfolders,'test.csv')
      if not result_train or not result_test:
        print('Failed to upload the split datasets into HF')
        return False
      print('Dataset downloaded from HF, cleaned, split into train and test, and uploaded back into the HF dataset')
      return True
    except Exception as ex:
      print(f"Exception message in ToRunPipeline: {ex}")
      traceback.print_exc()
      return False
    finally:
      print('-'*50)
Overwriting DataPrepration.py
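The TrainTestSplit step above leans on `stratify=` so that the ProdTaken class ratio is preserved in both splits; a small sketch with a synthetic imbalanced target (only the column name mirrors the real dataset):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic frame with an imbalanced binary target (20% positives)
df = pd.DataFrame({'x': range(100), 'ProdTaken': [1]*20 + [0]*80})

df_train, df_test = train_test_split(
    df, test_size=0.2, random_state=42, shuffle=True,
    stratify=df['ProdTaken'])            # keeps the 20/80 class ratio in both splits

print(df_train['ProdTaken'].mean(), df_test['ProdTaken'].mean())
```

Both splits end up with exactly 20% positives, which keeps evaluation on the test split representative of the full data.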

3. MODEL BUILDING WITH ENSEMBLE TECHNIQUES¶

      |--> class BuildingModels
            |--> def __init__(self,base_path, hf_token=None)
                  * Constructor for initializing the base path and Hugging Face token
            |--> def Load_data_from_HF(self)
                  * Loads the train and test datasets from Hugging Face
            |--> def Preprocessing_dataset(self)
                  * Splits the train and test datasets into features and target
            |--> def Building_Models(self)
                  * Builds decision tree, random forest, and gradient boosting models via RandomizedSearchCV
            |--> def Model_Evaluation(self)
                  * Evaluates the models by F1 score and picks the one with the highest F1 score
            |--> def Register_BestModel_HF(self)
                  * Registers the model with the highest F1 score in Hugging Face Models
            |--> def ToRunPipeline(self)
                  * Invokes the above functions in sequence
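Model_Evaluation chooses the decision threshold by maximizing F1 along the precision-recall curve; the core computation on toy scores (values are illustrative). Note that `precision_recall_curve` returns one more precision/recall entry than thresholds, so the last F1 value has no threshold and is skipped:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, f1_score

# Toy ground truth and predicted positive-class probabilities
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

precision, recall, thresholds = precision_recall_curve(y_true, y_prob)
f1 = 2 * precision * recall / (precision + recall + 1e-10)

# drop the final F1 entry, which has no corresponding threshold
best_idx = np.argmax(f1[:-1])
best_threshold = thresholds[best_idx]

y_pred = (y_prob >= best_threshold).astype(int)
print(best_threshold, f1_score(y_true, y_pred))
```

Thresholding at the F1-optimal point rather than a fixed 0.5 is what lets the pipeline trade precision for recall on the imbalanced ProdTaken target.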
In [8]:
#@title BuildingModels.py
%%writefile BuildingModels.py

import os
import joblib
import inspect
import traceback
import mlflow
import mlflow.sklearn
import mlflow.xgboost
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from xgboost import XGBClassifier
from datasets import load_dataset
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.tree import DecisionTreeClassifier
from sklearn.impute import SimpleImputer
from huggingface_hub.utils import RepositoryNotFoundError
from huggingface_hub import HfApi, create_repo, login
from huggingface_hub import hf_hub_download
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import KFold, RandomizedSearchCV
from sklearn.metrics import precision_recall_curve, precision_score
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import recall_score, f1_score, classification_report
from sklearn.preprocessing import StandardScaler, OneHotEncoder


class BuildingModels:
  def __init__(self,base_path, hf_token=None):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    print(f"Base Path: {base_path}")
    self.models = {}
    self.best_model = None
    self.best_score = 0
    self.best_f1_score =0.0
    self.best_model_threshold = 0.0
    self.best_model_name=None
    self.df_train = pd.DataFrame()
    self.df_test = pd.DataFrame()
    self.feature_train = pd.DataFrame()
    self.feature_test = pd.DataFrame()
    self.target_train = pd.Series()
    self.target_test = pd.Series()
    self.base_path = base_path
    self.Subfolders = os.path.join(base_path,'data')
    self.repo_id = 'jpkarthikeyan/Tourism_Prediction_Model'
    self.ds_repo_id = 'jpkarthikeyan/Tourism-visit-with-us-dataset'
    self.repo_type = 'model'
    self.hf_token = hf_token
    mlruns_path = os.path.join(base_path,"mlruns")
    print(f"ML Run path: {mlruns_path}")
    os.makedirs(mlruns_path, exist_ok=True)
    mlflow.set_tracking_uri(f"file://{mlruns_path}")
    print(f"Tracking URI file://{mlruns_path}")
    experiment = mlflow.set_experiment("Tourism-Prediction-Experiment")
    print(f"Experiment ID {experiment}")
    self.categorical_columns = ['TypeofContact','Occupation','Gender','ProductPitched','MaritalStatus','Designation']
    self.numerical_columns = ['Age','CityTier','DurationOfPitch','NumberOfPersonVisiting',
                              'NumberOfFollowups','PreferredPropertyStar',
                              'NumberOfTrips','Passport','PitchSatisfactionScore','OwnCar',
                              'NumberOfChildrenVisiting','MonthlyIncome']

    self.pipeline_numerical = Pipeline(steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())
    ])

    self.pipeline_onehot = Pipeline(steps=[
        ('imputer', SimpleImputer(strategy='most_frequent')),
        ('onehot', OneHotEncoder(drop='first',handle_unknown='ignore',sparse_output=False))
    ])

  def Load_data_from_HF(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      print(f'Loading the train dataset from {self.ds_repo_id}')

      self.df_train = pd.read_csv(hf_hub_download(
                                repo_id = self.ds_repo_id,
                                filename = 'Master/Data/train.csv',repo_type='dataset'))
      self.df_test = pd.read_csv(hf_hub_download(
                                repo_id = self.ds_repo_id,
                                filename = 'Master/Data/test.csv',repo_type='dataset'))
      print(f"Shape of the train dataset: {self.df_train.shape}")
      print(f"Shape of the test dataset: {self.df_test.shape}")

      return True
    except Exception as ex:
      print(f"Exception: {ex}")
      traceback.print_exc()
      return False
    finally:
      print('-'*50)

  def Preprocessing_dataset(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:

      self.target_train = self.df_train['ProdTaken']
      self.feature_train = self.df_train.drop(['ProdTaken'],axis=1)

      self.target_test = self.df_test['ProdTaken']
      self.feature_test = self.df_test.drop(['ProdTaken'],axis=1)

      return True

    except Exception as ex:
      print(f"Exception: {ex}")
      traceback.print_exc()
      return False
    finally:
      print('-'*50)

  def Building_Models(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      preprocessor = ColumnTransformer(
          transformers=[
              ('num', self.pipeline_numerical,self.numerical_columns),
              ('onehot', OneHotEncoder(drop='first',handle_unknown='ignore',
                        sparse_output=False),self.categorical_columns)])
      models_params = {
          'DecisionTreeClassifier':{
              'model': DecisionTreeClassifier(class_weight='balanced',random_state=42),
              'params': {'classifier__criterion':['gini','entropy'],
                         'classifier__splitter':['best','random'],
                        'classifier__max_depth':[1],
                         'classifier__min_samples_leaf':[1,2,4],
                         'classifier__min_samples_split':[2,5,10],
                         'classifier__max_features':['sqrt','log2',None]}
          },

          'RandomForestClassifier':{
              'model': RandomForestClassifier(class_weight='balanced',random_state=42),
              'params': { 'classifier__n_estimators':[25,50,75,100],
                          'classifier__criterion':['gini','entropy'],
                          'classifier__max_depth':[5,10,15],
                          'classifier__min_samples_split':[15,20,25],
                          'classifier__min_samples_leaf':[7,10,15],
                          'classifier__max_features':[0.3,0.5,0.6],
                          'classifier__oob_score':[True],
                          'classifier__bootstrap':[True]
                         }
          },

          'GradientBoostingClassifier':{
              'model': GradientBoostingClassifier(random_state=42),
              'params':{
                          'classifier__n_estimators':[50,75,100,125],
                          'classifier__learning_rate':[0.01,0.5,0.1],
                          'classifier__criterion':['friedman_mse','squared_error'],
                          'classifier__max_features':['sqrt','log2'],
                          'classifier__min_samples_leaf':[1,2,4],
                          'classifier__subsample':[0.6,0.7,0.8],
                          'classifier__max_depth':[2,3,4,5]
                        }
          }

        }

      cv_KFold = KFold(n_splits=3,random_state=42,shuffle=True)

      for model_name, mdl_params in models_params.items():
        print(f'Model {model_name} started')
        with mlflow.start_run(run_name=model_name):
          pipeline = Pipeline(steps=[
              ('preprocessor',preprocessor),
              ('classifier',mdl_params['model'])
              ])
          random_search = RandomizedSearchCV(pipeline,mdl_params['params'],
                                            n_iter=50,cv=cv_KFold,scoring='f1',
                                            random_state=42,n_jobs=-1,verbose=2)

          random_search.fit(self.feature_train,self.target_train)

          self.models[model_name] = {
              'model':random_search.best_estimator_,
              'best_score': random_search.best_score_,
              'best_params':random_search.best_params_
            }

          model_dir = os.path.join(self.base_path,'Model_Dump_JOBLIB')
          os.makedirs(model_dir,exist_ok=True)
          joblib.dump(random_search.best_estimator_,f'{self.base_path}/Model_Dump_JOBLIB/{model_name}.joblib')

          abs_path = os.path.join(self.base_path,'Model_Dump_JOBLIB',f'{model_name}.joblib')
          print(f'Model path: {abs_path}')
          rel_path = f'Model_Dump_JOBLIB/{model_name}.joblib'

          mlflow.log_params(random_search.best_params_)
          mlflow.log_metric('best_score',random_search.best_score_)
          mlflow.log_artifact(abs_path,artifact_path='models')
          print(f'model:{random_search.best_estimator_}')
          print(f'best_score: {random_search.best_score_}')
          print(f'best_params: {random_search.best_params_}')
          print(f'Model {model_name} completed')
          print('-'*50)

      return self.models
    except Exception as ex:
      print(f"Exception: {ex}")
      traceback.print_exc()
    finally:
      print('-'*50)

  def Model_Evaluation(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    df_metrics = pd.DataFrame()
    try:
      model_dir = os.path.join(self.base_path,'Model_Dump_JOBLIB')
      os.makedirs(model_dir,exist_ok=True)
      for mdl_name, mdl_info in self.models.items():
        with mlflow.start_run(run_name=f"{mdl_name}_eval"):
          model = mdl_info['model']
          predict_probability = model.predict_proba(self.feature_test)
          print(f"Predicted probability shape {mdl_name} {predict_probability.shape}")
          if predict_probability.shape[1] == 1:
            predict_probability = predict_probability.flatten()
          else:
            predict_probability = predict_probability[:,1]

          prc_precision, prc_recall, prc_threshold = precision_recall_curve(self.target_test, predict_probability)
          prc_f1score = 2*((prc_precision*prc_recall) / (prc_precision+prc_recall+1e-10))

          # precision/recall have one more entry than thresholds, so skip the last F1 value
          prc_threshold_idmx = np.argmax(prc_f1score[:-1])
          prc_best_threshold = prc_threshold[prc_threshold_idmx]
          print(f'best threshold: {prc_best_threshold}')

          predic_prob_threshold = (predict_probability >= prc_best_threshold).astype(int)
          #predic_prob_threshold = (predict_probability >= 0.5).astype(int)
          accuracy = accuracy_score(self.target_test,predic_prob_threshold)
          precision = precision_score(self.target_test,predic_prob_threshold)
          recall = recall_score(self.target_test,predic_prob_threshold)
          f1score = f1_score(self.target_test,predic_prob_threshold)
          class_report = classification_report(self.target_test,predic_prob_threshold)
          conf_matrix = confusion_matrix(self.target_test,predic_prob_threshold)

          lbl = ['TN', 'FP', 'FN', 'TP']
          cnf_lbl = ['\n{0:0.0f}'.format(cnf_val) for cnf_val in conf_matrix.flatten()]
          cn_percentage = ["\n{0:.2%}".format(item/conf_matrix.flatten().sum()) for item in conf_matrix.flatten()]

          confusion_label = np.asarray([["\n {0:0.0f}".format(item)+"\n{0:.2%}".format(item/conf_matrix.flatten().sum())]
                                  for item in conf_matrix.flatten()]).reshape(2,2)

          cnf_label = np.asarray([f'{lbl1} {lbl2} {lbl3}' for lbl1, lbl2, lbl3 in zip(lbl, cnf_lbl,  cn_percentage)]).reshape(2,2)

          plt.figure(figsize = (3,3))
          sns.heatmap(conf_matrix, annot = cnf_label, cmap = 'Spectral', fmt='' )
          plt.xlabel('Predicted')
          plt.ylabel('Actual')
          plt.title(f'{mdl_name} confusion matrix')
          plt.tight_layout()
          plot_path = os.path.join(self.base_path,'Model_Dump_JOBLIB',f'{mdl_name}_ConfusionMatrix.png')
          plt.savefig(plot_path)  # save before show: show() clears the current figure
          plt.show()
          plt.close()

          mlflow.log_metric('accuracy',accuracy)
          mlflow.log_metric('precision',precision)
          mlflow.log_metric('recall',recall)
          mlflow.log_metric('f1_score',f1score)
          mlflow.log_text(class_report,f'{mdl_name}_classification_report.txt')
          mlflow.log_artifact(plot_path,artifact_path='models')


          df_metrics = pd.concat([df_metrics,pd.DataFrame({'model':[mdl_name],'accuracy':[accuracy],
                                              'precision':[precision], 'recall':[recall],
                                              'f1_score':[f1score]})],ignore_index=True)
          print(df_metrics)

          if f1score > self.best_f1_score:
            self.best_f1_score = f1score
            self.best_model_threshold = prc_best_threshold
            self.best_model_name = mdl_name

      best_model = self.models[self.best_model_name]['model']
      if hasattr(best_model, 'feature_importances_'):
        feature_importance = pd.DataFrame({
            'feature':self.feature_train.columns,
            'importance': best_model.feature_importances_
        }).sort_values('importance',ascending=False)
        print('Feature Importance:\n',feature_importance)


      return df_metrics

    except Exception as ex:
      print(f"Exception: {ex}")
      traceback.print_exc()
      return pd.DataFrame()
    finally:
      print('-'*50)

  def Register_BestModel_HF(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      best_model = self.models[self.best_model_name]['model']
      joblib.dump(best_model,f'{self.base_path}/Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib')


      api = HfApi()
      try:
        api.repo_info(repo_id=self.repo_id,repo_type=self.repo_type)
      except RepositoryNotFoundError:
        api.create_repo(repo_id=self.repo_id, repo_type=self.repo_type,private=False)


      print("Uploading the best model into Hugging face")
      api.upload_file(path_or_fileobj = f'{self.base_path}/Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib',
                      path_in_repo = f"Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib",
                      repo_id=self.repo_id, repo_type=self.repo_type
                      )


      print("Uploading the best threshold text file to HF")
      with open(f'{self.base_path}/Model_Dump_JOBLIB/best_threshold.txt','w') as f:
        f.write(str(self.best_model_threshold))
      api.upload_file(path_or_fileobj = f"{self.base_path}/Model_Dump_JOBLIB/best_threshold.txt",
                      path_in_repo = f"Model_Dump_JOBLIB/best_threshold.txt",
                      repo_id=self.repo_id, repo_type=self.repo_type
                      )
      with mlflow.start_run(run_name=f"Best_{self.best_model_name}"):
        input_epl = self.feature_train.head(5)



        mlflow.log_metric('best_f1_score',self.best_f1_score)
        mlflow.log_metric('best_threshold',self.best_model_threshold)
        mlflow.sklearn.log_model(sk_model=best_model,
                                 artifact_path="BestModel",
                                 input_example=input_epl)
        mlflow.log_artifact(f'{self.base_path}/Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib', artifact_path='models')
        mlflow.log_artifact(f'{self.base_path}/Model_Dump_JOBLIB/best_threshold.txt',artifact_path='models')



      return True


    except Exception as ex:
      print(f"Exception: {ex}")
      traceback.print_exc()
      return False
    finally:
      print('-'*50)

  def ToRunPipeline(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      if not self.Load_data_from_HF():
        return False
      if not self.Preprocessing_dataset():
        return False
      Build_Model = self.Building_Models()
      print(Build_Model)
      if not Build_Model:
        return False
      df_Metrics = self.Model_Evaluation()
      print(df_Metrics)
      if df_Metrics is None or df_Metrics.empty:
        return False
      return self.Register_BestModel_HF()
    except Exception as ex:
      print(f'Exception occurred {ex}')
      traceback.print_exc()
      return False
    finally:
      print('-'*50)
Overwriting BuildingModels.py

4. HOSTING IN HUGGING FACE STREAMLIT (FRONT-END IMPLEMENTATION)¶

Streamlit deployment requirements.txt file

In [12]:
%%writefile Deployment/requirements.txt
pandas
numpy
scikit-learn==1.6.1
joblib
streamlit
huggingface_hub
setuptools
Overwriting Deployment/requirements.txt

Streamlit deployment Readme file

In [13]:
%%writefile Deployment/README.md
---
title: Visit With Us - Tourism package prediction
emoji: 🚩
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: 3.9
app_file: app.py
app_type: streamlit
pinned: false
license: mit
---
The Streamlit app predicts whether a customer will purchase the tourism package
Overwriting Deployment/README.md

Streamlit deployment Docker file

In [43]:
%%writefile Deployment/Dockerfile
# Use a minimal base image with Python 3.12 installed
FROM python:3.12-slim

# Set the working directory inside the container to /app
WORKDIR /app

# Copy all files from the current directory on the host to the container's /app directory
COPY . .

# Install Python dependencies listed in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
RUN mkdir -p /tmp/hf_cache && chmod -R 777 /tmp/hf_cache
ENV HF_HOME=/tmp/hf_cache
ENV HUGGINGFACE_HUB_CACHE=/tmp/hf_cache
ENV PYTHONUNBUFFERED=1


EXPOSE 7860


# Define the command to run the Streamlit app on port "7860" and make it accessible externally
CMD ["streamlit", "run", "app.py", "--server.port=7860", "--server.address=0.0.0.0", "--server.enableXsrfProtection=false"]
Overwriting Deployment/Dockerfile
        |--> app.py
            |--> class PredictorTourism
                |-->  def __init__(self)
                      * Constructor for the tourism predictor
                |-->  def Load_Model(self):
                      * Loads the best model and threshold file from Hugging Face
                |-->  def Predict(self, data):
                      * Runs the model prediction on the user input
            |--> Front-end form creation
                  * Builds the Streamlit form to collect the user input
            |--> Prediction invocation
                  * Invokes the prediction on the collected input and displays the result
In [16]:
%%writefile Deployment/app.py
import streamlit as st
import pandas as pd
import joblib
import os
import logging
from huggingface_hub import login,hf_hub_download


logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
os.environ["STREAMLIT_CONFIG_DIR"] = "/tmp/.streamlit"
cache_dir = "/tmp/hf_cache"
os.environ["HF_HOME"] = cache_dir
os.environ["HUGGINGFACE_HUB_CACHE"] = cache_dir

try:
  hf_token = os.getenv("HUGGINGFACE_TOKEN")

  if hf_token:
    login(token=hf_token)
    logger.info("Successfully logged in to Hugging Face")
  else:
    logger.error("Hugging face token not found")
    st.error("Huggingface token not found")
except Exception as ex:
  logger.error(f"Failed to login to Hugging face: {ex} ")
  st.write(f"Failed to login to Hugging face: {ex} ")

try:
  os.makedirs(cache_dir, exist_ok=True)
  logger.info(f"Created cache directory {cache_dir}")
except Exception as ex:
  logger.error(f"Failed to create cache directory {cache_dir}: {ex}")
  st.error(f"Failed to create cache directory {cache_dir}: {ex}")


st.title("Visit with Us: Tourism Package Prediction")
st.write("Enter the Customer details to predict the likehood of purchasing the tourism packages")


if 'predictor' not in st.session_state:
  st.session_state.predictor = None
  st.session_state.model_loaded = False

class PredictorTourism:

  def __init__(self):
    self.Subfolders = 'Master'
    self.repoID = 'jpkarthikeyan/Tourism_Prediction_Model'
    self.model = None
    self.best_threshold = 0.0


  def Load_Model(self):
    try:
      logger.info("Loading best model")
      model_path = hf_hub_download(
          repo_id = self.repoID, filename = 'Model_Dump_JOBLIB/BestModel_GradientBoostingClassifier.joblib',
          repo_type = 'model')
      threshold_path = hf_hub_download(
          repo_id = self.repoID, filename = 'Model_Dump_JOBLIB/best_threshold.txt',
          repo_type='model')

      logger.info(f"Model path: {model_path}")
      logger.info(f"Threshold path:  {threshold_path}")

      self.model = joblib.load(model_path)
      # with open(model_path, 'rb') as f:
      #   self.model = joblib.load(f)
      with open(threshold_path,'r') as f:
        self.best_threshold = float(f.read())
      st.success("Model and threshold loaded successfully")
      return True

    except Exception as ex:
      st.error(f'Exception: {ex}')
      logging.error(f'Exception {ex}')
      return False


  def Predict(self, data):
    try:
      logger.info(f"Input Data: {data}")
      df= pd.DataFrame([data])
      logger.info(f"Data shape: {df.shape}")
      logger.info(f"Dataframe columns: {df.columns.tolist()}")
      prob = self.model.predict_proba(df)[:,1]
      prediction = int(prob >= self.best_threshold)
      return prediction

    except Exception as ex:
      logger.error(f"Exception in predict: {ex}", exc_info=True)
      st.error(f"Exception Prediction: {ex}")
      return None


if not st.session_state.model_loaded:
  st.session_state.predictor = PredictorTourism()
  st.session_state.model_loaded = st.session_state.predictor.Load_Model()

with st.form("customer_form"):
  st.header("Customer Details")
  col1, col2,col3 = st.columns(3)

  with col1:

    age = st.number_input("Age", min_value=18, max_value=100, value=41)
    gender = st.selectbox('Gender',['Male','Female'])
    MaritalStatus = st.selectbox('MaritalStatus',['Married','Unmarried','Single','Divorced'])
    Occupation = st.selectbox('Occupation',['Free Lancer','Salaried','Small Business','Large Business'])
    Designation = st.selectbox('Designation',['AVP','Manager','Executive','Senior Manager','VP'])
    MonthlyIncome = st.number_input('MonthlyIncome',min_value=0, max_value=1000000,value=20999)

  with col2:

    typeofcontact = st.selectbox("TypeofContact",['Self Enquiry','Company Invited'])
    citytier = st.selectbox('citytier',[1,2,3], index=2)
    DurationOfPitch = st.number_input('DurationOfPitch', min_value=1, max_value=60, value=6)
    ProductPitched = st.selectbox('ProductPitched',['Deluxe','Basic','Standard','Super Deluxe','King'])
    PreferredPropertyStar = st.selectbox('PreferredPropertyStar',[3,2,1])
    NumberOfTrips = st.number_input('NumberOfTrips',min_value=0, max_value=30, value=1)


  with col3:
    NumberOfPersonVisiting = st.number_input('NumberOfPersonVisiting',min_value=1,max_value=10,value=3)
    NumberOfFollowups = st.number_input('NumberOfFollowups',min_value=0,max_value=10, value=3)
    NumberOfChildrenVisiting= st.number_input('NumberOfChildrenVisiting',min_value=0,max_value=5,value=0)
    Passport = st.selectbox('Passport',['Yes','No'])
    Owncar = st.selectbox('OwnCar',['Yes','No'])
    PitchSatisfactionScore= st.number_input('PitchSatisfactionScore',min_value=1,max_value=5,value=3)


  submitted = st.form_submit_button("Predict")

if submitted:
  input_data = {
      'Age':age,
      'TypeofContact':typeofcontact,
      'CityTier':citytier,
      'DurationOfPitch':DurationOfPitch,
      'Occupation':Occupation,
      'Gender':gender,
      'NumberOfPersonVisiting':NumberOfPersonVisiting,
      'NumberOfFollowups':NumberOfFollowups,
      'ProductPitched':ProductPitched,
      'PreferredPropertyStar':PreferredPropertyStar,
      'MaritalStatus':MaritalStatus,
      'NumberOfTrips':NumberOfTrips,
      'Passport':1 if Passport =="Yes" else 0,
      'OwnCar':1 if Owncar =="Yes" else 0,
      'PitchSatisfactionScore':PitchSatisfactionScore,
      'NumberOfChildrenVisiting':NumberOfChildrenVisiting,
      'Designation':Designation,
      'MonthlyIncome':MonthlyIncome

  }


  if st.session_state.predictor:
    result = st.session_state.predictor.Predict(input_data)

    if result is not None:
      st.subheader(f"Prediction Result is {result}")
      st.write("Likely to purchase" if result == 1 else "Unlikely to purchase")
    else:
      st.error("Error in prediction")
  else:
    st.error("Models are not loaded, please ensure the model and threshold are available on Hugging face")
Overwriting Deployment/app.py
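The core of `Predict()` above is a thresholded decision rule: the model emits a purchase probability and a tuned cutoff (loaded alongside the model, rather than the default 0.5) converts it to a 0/1 label. A minimal sketch of just that rule, with an illustrative threshold value:

```python
def apply_threshold(prob: float, threshold: float) -> int:
    """Return 1 (likely to purchase) when prob >= threshold, else 0."""
    return int(prob >= threshold)

# Illustrative value only; the real best_threshold is tuned during model building.
best_threshold = 0.35
print(apply_threshold(0.42, best_threshold))  # 1: above the tuned cutoff
print(apply_threshold(0.20, best_threshold))  # 0: below it
```

Lowering the threshold below 0.5 trades precision for recall, which suits a marketing use case where missing a likely buyer costs more than an extra call.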
      |--> class HostingInHuggingFace
          |--> def __init__(self,base_path,hf_token=None):
              * Constructor to initialize the base path and HF token
          |--> def CreatingSpaceInHF(self):
              * Function to create the Hugging Face Space that hosts the deployment files
          |--> def UploadDeploymentFile(self):
              * Uploads the deployment files into the Hugging Face Space
          |--> def ToRunPipeline(self):
              * Pipeline function to invoke the above steps in sequence
In [30]:
#@title HostingInHuggingFace.py
%%writefile HostingInHuggingFace.py
import os
import inspect
import traceback
from huggingface_hub import HfApi, create_repo
from huggingface_hub.utils import RepositoryNotFoundError

class HostingInHuggingFace:
  def __init__(self,base_path,hf_token=None):
    self.base_path = base_path
    self.hf_token = hf_token
    self.repo_id = 'jpkarthikeyan/Tourism-Prediction-Model-Space'

  def CreatingSpaceInHF(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    api = HfApi()
    try:
      print(f"Checking for {self.repo_id} is correct or not")
      api.repo_info(repo_id = self.repo_id,
                    repo_type='space',
                    token = self.hf_token)
      print(f"Space {self.repo_id} already exists")
    except RepositoryNotFoundError:
      create_repo(repo_id=self.repo_id,
                  repo_type='space',
                  space_sdk='docker',
                  private=False,
                  token=self.hf_token)
      print(f"Space created in {self.repo_id}")
    except Exception as ex:
      print(f"Exception in creating space {ex}")
      traceback.print_exc()
    finally:
      print('-'*50)


  def UploadDeploymentFile(self):
    print(f"Function Name {inspect.currentframe().f_code.co_name}")
    try:
      api = HfApi(token=self.hf_token)
      directory_to_upload = os.path.join(self.base_path,'Deployment')
      print(f"Directory to upload {directory_to_upload} into HF Space {self.repo_id}")
      api.upload_folder(repo_id=self.repo_id, folder_path=directory_to_upload,
                        repo_type='space')
      print(f"Successfully upload {directory_to_upload} into {self.repo_id}")


      return True
    except Exception as ex:
      print(f"Exception occurred: {ex}")
      traceback.print_exc()
      return False
    finally:
      print('-'*50)

  def ToRunPipeline(self):
    try:
      self.CreatingSpaceInHF()
      if self.UploadDeploymentFile():
        print('Deployment pipeline completed')
        return True
      else:
        print('Deployment pipeline failed')
        return False
    except Exception as ex:
      print(f"Exception occurred: {ex}")
      traceback.print_exc()
      return False
    finally:
      print('-'*50)
Overwriting HostingInHuggingFace.py

Main Function¶

In [8]:
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/'
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
In [14]:
os.getcwd()
Out[14]:
'/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master'
In [9]:
%%writefile main.py
import os
import sys
import argparse
from dotenv import load_dotenv

try:
  base_path = os.path.abspath(os.path.dirname(__file__))
except NameError:
  # __file__ is undefined when running interactively (e.g. inside a notebook)
  base_path = os.path.join(os.getcwd(),'Master')

print(f'Base path {base_path}')

sys.path.append(base_path)

data_dir = os.path.join(base_path, 'Data')
model_dir = os.path.join(base_path,'Model_Dump_JOBLIB')
#job = ['register','prepare']
#job = 'prepare'

parser = argparse.ArgumentParser(description='Run a specific job in the pipeline')
parser.add_argument('--job', type=str, required=True,
                    choices=['register','prepare','modelbuilding','deploy'],
                    help='Job To execute register,prepare,modelbuilding,deploy')
args = parser.parse_args()

os.makedirs(data_dir, exist_ok=True)
os.makedirs(model_dir, exist_ok=True)
load_dotenv(dotenv_path=os.path.join(base_path,'.env'))
hf_token = os.getenv('HF_TOKEN')
if not hf_token:
  raise ValueError("HF_TOKEN not found in .env file")

if args.job == 'register':
  from DataRegistration import DataRegistration
  data_reg = DataRegistration(base_path, hf_token)
  if not data_reg.ToRunPipeline():
    sys.exit(1)
elif args.job == 'prepare':
  from DataPrepration import DataPrepration
  obj_data_prep = DataPrepration(base_path,hf_token)
  if not obj_data_prep.ToRunPipeline():
    sys.exit(1)
elif args.job == 'modelbuilding':
  from BuildingModels import BuildingModels
  ObjBuildModel = BuildingModels(base_path,hf_token)
  if not ObjBuildModel.ToRunPipeline():
    sys.exit(1)
elif args.job == 'deploy':
  from HostingInHuggingFace import HostingInHuggingFace
  Obj_deploy = HostingInHuggingFace(base_path,hf_token)
  if not Obj_deploy.ToRunPipeline():
    sys.exit(1)
Overwriting main.py
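The dispatch in `main.py` relies on argparse's `choices` to reject anything other than the four pipeline stages, so each CI job can only trigger a known step. A minimal sketch of that CLI contract, simulating the command-line arguments in-process:

```python
import argparse

# Same argument definition as main.py: --job is required and restricted
# to the four pipeline stages; anything else makes argparse exit with an error.
parser = argparse.ArgumentParser(description='Run a specific job in the pipeline')
parser.add_argument('--job', type=str, required=True,
                    choices=['register', 'prepare', 'modelbuilding', 'deploy'])

# Simulate `python main.py --job prepare`
args = parser.parse_args(['--job', 'prepare'])
print(args.job)  # prepare
```

Because each branch imports its module only when selected, a syntax error in, say, `BuildingModels.py` cannot break the `register` job.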
In [40]:
#@title Invoking the DataRegistration.py from main.py | !python main.py --job register
!python main.py --job register
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name __init__
self.Subfolders: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
self.folder_Master: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
folder_data: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
--------------------------------------------------
Function Name ToRunPipeline
Function Name HFCreateRepo
Repo jpkarthikeyan/Tourism-visit-with-us-dataset created
----------------------------------------------------------------------------------------------------
--------------------------------------------------
Function Name UploadingSourceData
Source Data File /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data/tourism.csv
Source data tourism.csv uploaded into jpkarthikeyan/Tourism-visit-with-us-dataset
----------------------------------------------------------------------------------------------------
Data Registration Completed
--------------------------------------------------
In [41]:
#@title Invoking the DataPrepration.py from main.py | !python main.py --job prepare
!python main.py --job prepare
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name __init__
self.repoID: jpkarthikeyan/Tourism-visit-with-us-dataset
self.Subfolders: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
--------------------------------------------------
Function Name ToRunPipeline
Function Name LoadDatasetFromHF
Shape of the original dataset (4128, 21)
Dataset loaded from jpkarthikeyan/Tourism-visit-with-us-dataset//content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
Shape of the Original Dataset: (4128, 20)
--------------------------------------------------
Function Name TrainTestSplit
Value Count ProdTaken
0    3331
1     797
Name: count, dtype: int64
Shape of the train dataset: (3302, 20)
Shape of the test dataset: (826, 20)
--------------------------------------------------
Function Name DatasetCleaning
--------------------------------------------------
Function Name DatasetCleaning
--------------------------------------------------
Function Name UploadIntoHF
Source data train.csv uploaded into jpkarthikeyan/Tourism-visit-with-us-dataset
--------------------------------------------------
Function Name UploadIntoHF
Source data test.csv uploaded into jpkarthikeyan/Tourism-visit-with-us-dataset
--------------------------------------------------
Dataset downloaded from HF, cleaned, split into train and test datasets, and uploaded back into the HF dataset
--------------------------------------------------
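The split reported above (3302 train / 826 test rows out of 4128, with a 3331/797 class imbalance on `ProdTaken`) is consistent with an 80/20 split. A minimal sketch of such a split with scikit-learn, assuming stratification on the target and using a toy DataFrame in place of the tourism dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the tourism dataset: 100 rows, 20% positive class,
# mirroring the real data's imbalance on the ProdTaken target.
df = pd.DataFrame({'feature': range(100),
                   'ProdTaken': [0] * 80 + [1] * 20})

# Stratifying keeps the class ratio identical in both partitions.
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42,
                                     stratify=df['ProdTaken'])
print(train_df.shape, test_df.shape)  # (80, 2) (20, 2)
```

Without `stratify`, a random 20% sample of an imbalanced dataset can end up with too few positives in the test set to evaluate recall reliably.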
In [43]:
#@title Invoking the BuildingModels.py from main.py | !python main.py --job modelbuilding
!python main.py --job modelbuilding
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name __init__
ML Run path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns
Tracking URI file:///content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns
Experiment ID <Experiment: artifact_location='file:///content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns/458707103760693873', creation_time=1755962521638, experiment_id='458707103760693873', last_update_time=1755962521638, lifecycle_stage='active', name='Tourism-Prediction-Experiment', tags={}>
Function Name ToRunPipeline
Function Name Load_data_from_HF
Loading the train dataset from jpkarthikeyan/Tourism-visit-with-us-dataset
Shape of the train dataset: (3302, 19)
Shape of the train dataset: (826, 19)
--------------------------------------------------
Function Name Preprocessing_dataset
--------------------------------------------------
Function Name Building_Models
Model DecisionTreeClassifier started
Fitting 3 folds for each of 50 candidates, totalling 150 fits
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
... (remaining verbose [CV] fold logs truncated: 150 fits over 50 candidate hyperparameter combinations, each completing in under 0.1s) ...
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=4, classifier__min_samples_split=2, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.1s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=1, classifier__min_samples_split=5, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=1, classifier__min_samples_split=5, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=1, classifier__min_samples_split=5, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__min_samples_split=10, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__min_samples_split=2, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__min_samples_split=2, classifier__splitter=best; total time=   0.1s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__min_samples_split=2, classifier__splitter=best; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=gini, classifier__max_depth=1, classifier__max_features=None, classifier__min_samples_leaf=2, classifier__min_samples_split=2, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__min_samples_split=10, classifier__splitter=random; total time=   0.0s
Model path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Model_Dump_JOBLIB/DecisionTreeClassifier.joblib
model: Pipeline(steps=[('preprocessor',
                 ColumnTransformer(transformers=[('num',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median')),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['Age', 'CityTier',
                                                   'DurationOfPitch',
                                                   'NumberOfPersonVisiting',
                                                   'NumberOfFollowups',
                                                   'PreferredPropertyStar',
                                                   'NumberOfTrips', 'Passport',
                                                   'PitchSatisfactionScore',
                                                   'OwnCar',
                                                   'NumberOfChildrenVisiting',
                                                   'MonthlyIncome']),
                                                 ('onehot',
                                                  OneHotEncoder(drop='first',
                                                                handle_unknown='ignore',
                                                                sparse_output=False),
                                                  ['TypeofContact',
                                                   'Occupation', 'Gender',
                                                   'ProductPitched',
                                                   'MaritalStatus',
                                                   'Designation'])])),
                ('classifier',
                 DecisionTreeClassifier(class_weight='balanced', max_depth=1,
                                        min_samples_leaf=2, min_samples_split=5,
                                        random_state=42, splitter='random'))])
best_score: 0.4412564666937607
best_params: {'classifier__splitter': 'random', 'classifier__min_samples_split': 5, 'classifier__min_samples_leaf': 2, 'classifier__max_features': None, 'classifier__max_depth': 1, 'classifier__criterion': 'gini'}
Model DecisionTreeClassifier completed
--------------------------------------------------
Model RandomForestClassifier started
Fitting 3 folds for each of 50 candidates, totalling 150 fits
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.9s
[... remaining verbose per-fold [CV] logs for the RandomForestClassifier search omitted; individual fits took between 0.2s and 1.1s ...]
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   1.0s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   1.0s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.7s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.7s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.7s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time=   0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time=   1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time=   0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time=   0.4s
Model path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Model_Dump_JOBLIB/RandomForestClassifier.joblib
model: Pipeline(steps=[('preprocessor',
                 ColumnTransformer(transformers=[('num',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median')),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['Age', 'CityTier',
                                                   'DurationOfPitch',
                                                   'NumberOfPersonVisiting',
                                                   'NumberOfFollowups',
                                                   'PreferredPropertyStar',
                                                   'NumberOfTrips', 'Passport',
                                                   'PitchSatisfactionScore',
                                                   'OwnCar',
                                                   'NumberOfChildrenVisit...
                                                  OneHotEncoder(drop='first',
                                                                handle_unknown='ignore',
                                                                sparse_output=False),
                                                  ['TypeofContact',
                                                   'Occupation', 'Gender',
                                                   'ProductPitched',
                                                   'MaritalStatus',
                                                   'Designation'])])),
                ('classifier',
                 RandomForestClassifier(class_weight='balanced',
                                        criterion='entropy', max_depth=15,
                                        max_features=0.6, min_samples_leaf=7,
                                        min_samples_split=20, n_estimators=25,
                                        oob_score=True, random_state=42))])
best_score: 0.6512043836331847
best_params: {'classifier__oob_score': True, 'classifier__n_estimators': 25, 'classifier__min_samples_split': 20, 'classifier__min_samples_leaf': 7, 'classifier__max_features': 0.6, 'classifier__max_depth': 15, 'classifier__criterion': 'entropy', 'classifier__bootstrap': True}
Model RandomForestClassifier completed
--------------------------------------------------
Model GradientBoostingClassifier started
Fitting 3 folds for each of 50 candidates, totalling 150 fits
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.8; total time=   0.5s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.8; total time=   0.6s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.8; total time=   0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=3, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__n_estimators=75, classifier__subsample=0.6; total time=   0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=3, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__n_estimators=75, classifier__subsample=0.6; total time=   0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=3, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__n_estimators=75, classifier__subsample=0.6; total time=   0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=100, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=100, classifier__subsample=0.7; total time=   0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=100, classifier__subsample=0.7; total time=   0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.8; total time=   0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.8; total time=   0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.8; total time=   0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.4s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.4s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=3, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=3, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.7; total time=   0.2s
[... verbose cross-validation log elided: one "[CV] END ..." line per fold (3 folds) for each sampled hyperparameter candidate, each completing in under a second ...]
Model path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Model_Dump_JOBLIB/GradientBoostingClassifier.joblib
model:Pipeline(steps=[('preprocessor',
                 ColumnTransformer(transformers=[('num',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median')),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['Age', 'CityTier',
                                                   'DurationOfPitch',
                                                   'NumberOfPersonVisiting',
                                                   'NumberOfFollowups',
                                                   'PreferredPropertyStar',
                                                   'NumberOfTrips', 'Passport',
                                                   'PitchSatisfactionScore',
                                                   'OwnCar',
                                                   'NumberOfChildrenVisiting',
                                                   'MonthlyIncome']),
                                                 ('onehot',
                                                  OneHotEncoder(drop='first',
                                                                handle_unknown='ignore',
                                                                sparse_output=False),
                                                  ['TypeofContact',
                                                   'Occupation', 'Gender',
                                                   'ProductPitched',
                                                   'MaritalStatus',
                                                   'Designation'])])),
                ('classifier',
                 GradientBoostingClassifier(learning_rate=0.5, max_depth=5,
                                            max_features='log2',
                                            random_state=42, subsample=0.8))])
best_score: 0.6906142009293693
best_params: {'classifier__subsample': 0.8, 'classifier__n_estimators': 100, 'classifier__min_samples_leaf': 1, 'classifier__max_features': 'log2', 'classifier__max_depth': 5, 'classifier__learning_rate': 0.5, 'classifier__criterion': 'friedman_mse'}
Model GradientBoostingClassifier completed
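The `Model path:` line above records the fitted pipeline being persisted to the `Model_Dump_JOBLIB` directory with joblib. A minimal round-trip sketch of that step; the toy estimator, synthetic data, and temporary local path are illustrative, not the notebook's actual objects:

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Toy stand-in for the tuned pipeline above.
X, y = make_classification(n_samples=100, n_features=5, random_state=42)
model = GradientBoostingClassifier(random_state=42).fit(X, y)

# Persist the fitted estimator, then reload it as a deployment/CI step would.
path = os.path.join(tempfile.mkdtemp(), "GradientBoostingClassifier.joblib")
joblib.dump(model, path)
restored = joblib.load(path)

# The reloaded model reproduces the original's predictions exactly.
print((restored.predict(X) == model.predict(X)).all())  # → True
```

Persisting the whole pipeline (preprocessor plus classifier) in one artifact, as the repr above shows, means the serving side only needs `joblib.load` plus raw feature columns; no separate preprocessing code has to be kept in sync.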
--------------------------------------------------
--------------------------------------------------
{'DecisionTreeClassifier': {'model': Pipeline(steps=[('preprocessor',
                 ColumnTransformer(transformers=[('num',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median')),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['Age', 'CityTier',
                                                   'DurationOfPitch',
                                                   'NumberOfPersonVisiting',
                                                   'NumberOfFollowups',
                                                   'PreferredPropertyStar',
                                                   'NumberOfTrips', 'Passport',
                                                   'PitchSatisfactionScore',
                                                   'OwnCar',
                                                   'NumberOfChildrenVisiting',
                                                   'MonthlyIncome']),
                                                 ('onehot',
                                                  OneHotEncoder(drop='first',
                                                                handle_unknown='ignore',
                                                                sparse_output=False),
                                                  ['TypeofContact',
                                                   'Occupation', 'Gender',
                                                   'ProductPitched',
                                                   'MaritalStatus',
                                                   'Designation'])])),
                ('classifier',
                 DecisionTreeClassifier(class_weight='balanced', max_depth=1,
                                        min_samples_leaf=2, min_samples_split=5,
                                        random_state=42, splitter='random'))]), 'best_score': np.float64(0.4412564666937607), 'best_params': {'classifier__splitter': 'random', 'classifier__min_samples_split': 5, 'classifier__min_samples_leaf': 2, 'classifier__max_features': None, 'classifier__max_depth': 1, 'classifier__criterion': 'gini'}}, 'RandomForestClassifier': {'model': Pipeline(steps=[('preprocessor',
                 ColumnTransformer(transformers=[('num',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median')),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['Age', 'CityTier',
                                                   'DurationOfPitch',
                                                   'NumberOfPersonVisiting',
                                                   'NumberOfFollowups',
                                                   'PreferredPropertyStar',
                                                   'NumberOfTrips', 'Passport',
                                                   'PitchSatisfactionScore',
                                                   'OwnCar',
                                                   'NumberOfChildrenVisit...
                                                  OneHotEncoder(drop='first',
                                                                handle_unknown='ignore',
                                                                sparse_output=False),
                                                  ['TypeofContact',
                                                   'Occupation', 'Gender',
                                                   'ProductPitched',
                                                   'MaritalStatus',
                                                   'Designation'])])),
                ('classifier',
                 RandomForestClassifier(class_weight='balanced',
                                        criterion='entropy', max_depth=15,
                                        max_features=0.6, min_samples_leaf=7,
                                        min_samples_split=20, n_estimators=25,
                                        oob_score=True, random_state=42))]), 'best_score': np.float64(0.6512043836331847), 'best_params': {'classifier__oob_score': True, 'classifier__n_estimators': 25, 'classifier__min_samples_split': 20, 'classifier__min_samples_leaf': 7, 'classifier__max_features': 0.6, 'classifier__max_depth': 15, 'classifier__criterion': 'entropy', 'classifier__bootstrap': True}}, 'GradientBoostingClassifier': {'model': Pipeline(steps=[('preprocessor',
                 ColumnTransformer(transformers=[('num',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median')),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['Age', 'CityTier',
                                                   'DurationOfPitch',
                                                   'NumberOfPersonVisiting',
                                                   'NumberOfFollowups',
                                                   'PreferredPropertyStar',
                                                   'NumberOfTrips', 'Passport',
                                                   'PitchSatisfactionScore',
                                                   'OwnCar',
                                                   'NumberOfChildrenVisiting',
                                                   'MonthlyIncome']),
                                                 ('onehot',
                                                  OneHotEncoder(drop='first',
                                                                handle_unknown='ignore',
                                                                sparse_output=False),
                                                  ['TypeofContact',
                                                   'Occupation', 'Gender',
                                                   'ProductPitched',
                                                   'MaritalStatus',
                                                   'Designation'])])),
                ('classifier',
                 GradientBoostingClassifier(learning_rate=0.5, max_depth=5,
                                            max_features='log2',
                                            random_state=42, subsample=0.8))]), 'best_score': np.float64(0.6906142009293693), 'best_params': {'classifier__subsample': 0.8, 'classifier__n_estimators': 100, 'classifier__min_samples_leaf': 1, 'classifier__max_features': 'log2', 'classifier__max_depth': 5, 'classifier__learning_rate': 0.5, 'classifier__criterion': 'friedman_mse'}}}
Function Name Model_Evaluation
Predict probability shape DecisionTreeClassifier (826, 2)
best threshold: 0.7076288212958588
Figure(300x300)
Predict probability shape RandomForestClassifier (826, 2)
best threshold: 0.43784253782494276
Figure(300x300)
Predict probability shape GradientBoostingClassifier (826, 2)
best threshold: 0.24867338889476726
Figure(300x300)
--------------------------------------------------
                        model  accuracy  precision    recall  f1_score
0      DecisionTreeClassifier  0.699758   0.326848  0.528302  0.403846
1      RandomForestClassifier  0.868039   0.625000  0.786164  0.696379
2  GradientBoostingClassifier  0.917676   0.766082  0.823899  0.793939
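The "best threshold" values logged above come from tuning the decision threshold on predicted probabilities instead of using the default 0.5. The project's exact selection logic is not shown in this output, so the sketch below is illustrative only: one common approach is to pick the cutoff that maximizes F1 on held-out data using the precision-recall curve.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def best_f1_threshold(y_true, proba_pos):
    """Pick the probability cutoff that maximizes F1 on held-out data."""
    precision, recall, thresholds = precision_recall_curve(y_true, proba_pos)
    # precision/recall have one more entry than thresholds; drop the last point
    f1 = 2 * precision * recall / np.clip(precision + recall, 1e-12, None)
    best = int(np.argmax(f1[:-1]))
    return thresholds[best], f1[best]

# Toy example mirroring the 2-column predict_proba output
# (only the positive-class column is passed in)
y = np.array([0, 0, 0, 0, 1, 1])
p = np.array([0.1, 0.2, 0.3, 0.6, 0.55, 0.9])
thr, f1 = best_f1_threshold(y, p)  # thr=0.55, F1=0.8 on this toy data
```

The same routine, applied per model, would yield one tuned threshold per classifier, matching the three "best threshold" lines above.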
Function Name Register_BestModel_HF
Uploading the best model into Hugging face
BestModel_GradientBoostingClassifier.joblib: 100% 480k/480k [00:00<00:00, 645kB/s]
Uploading the best threshold text file to HF
2025/08/24 04:00:47 WARNING mlflow.models.model: `artifact_path` is deprecated. Please use `name` instead.
/usr/local/lib/python3.12/dist-packages/mlflow/types/utils.py:452: UserWarning: Hint: Inferred schema contains integer column(s). Integer columns in Python cannot represent missing values. If your input data contains missing values at inference time, it will be encoded as floats and will cause a schema enforcement error. The best way to avoid this problem is to infer the model schema based on a realistic data sample (training dataset) that includes missing values. Alternatively, you can declare integer columns as doubles (float64) whenever these columns may have missing values. See `Handling Integers With Missing Values <https://www.mlflow.org/docs/latest/models.html#handling-integers-with-missing-values>`_ for more details.
  warnings.warn(
--------------------------------------------------
--------------------------------------------------
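The `Register_BestModel_HF` upload logged above can be approximated directly with `huggingface_hub`. This is a hedged sketch, not the project's actual code: the filename helper mirrors the `BestModel_<ClassifierName>.joblib` artifact name seen in the log, the `repo_id` matches the model repo from the verification step, and the network upload only runs when `HF_TOKEN` is set.

```python
import os
import joblib

def artifact_name(estimator) -> str:
    """Build the BestModel_<ClassName>.joblib filename used in the logs."""
    return f"BestModel_{type(estimator).__name__}.joblib"

def register_best_model(estimator, repo_id="jpkarthikeyan/Tourism_Prediction_Model"):
    """Serialize a model and, when credentials exist, push it to the HF Hub."""
    path = artifact_name(estimator)
    joblib.dump(estimator, path)  # serialize the fitted pipeline/model
    token = os.getenv("HF_TOKEN")
    if token:
        # imported lazily: the upload needs network access and a valid token
        from huggingface_hub import HfApi
        HfApi(token=token).upload_file(
            path_or_fileobj=path, path_in_repo=path,
            repo_id=repo_id, repo_type="model",
        )
    return path
```

The tuned threshold can be uploaded the same way as a small text file next to the model, as the log above indicates.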
In [29]:
#@title Invoking the HostingInHuggingFace.py from main.py | !python main.py --job deploy
!python main.py --job deploy
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name CreatingSpaceInHF
Checking for jpkarthikeyan/Tourism-Prediction-Model-Space is correct or not
Space jpkarthikeyan/Tourism-Prediction-Model-Space already exists
--------------------------------------------------
Function Name UploadDeploymentFile
Directory to upload /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Deployment into HF Space jpkarthikeyan/Tourism-Prediction-Model-Space
No files have been modified since last commit. Skipping to prevent empty commit.
Successfully upload /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Deployment into jpkarthikeyan/Tourism-Prediction-Model-Space
--------------------------------------------------
Deployment pipeline completed
--------------------------------------------------

pipeline.yml¶

In [36]:
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/'
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1
In [ ]:
!ls
Master
        |--> pipeline.yml
            |--> Initializing
            |--> jobs
                  |--> register-dataset
                        |--> Set up job
                        |--> Checkout Repository
                        |--> Setup python
                        |--> Install Dependencies
                        |--> List Directory Contents (debug)
                        |--> Copy tourism.csv from local
                        |--> Run DataRegistration
                        |--> Check pipeline status
                        |--> Verify upload
                        |--> post setup python
                        |--> Post Checkout Repository
                        |--> Complete Job
                  |--> data-prepration
                        |--> Set up job
                        |--> Checkout Repository
                        |--> Set up Python
                        |--> Install Dependencies
                        |--> Copy tourism.csv
                        |--> Run DataPrepration.py
                        |--> Check Pipeline status
                        |--> Verify Upload
                        |--> Post Set up Python
                        |--> Post Checkout Repository
                        |--> Complete Job
                  |--> model-building
                        |--> Set up job
                        |--> Checkout Repository
                        |--> Set up Python
                        |--> Install Dependencies
                        |--> Create Model Dump Directory
                        |--> Run Model Building
                        |--> Check pipeline status
                        |--> Verify Execution
                        |--> List Generated Files
                        |--> Commit and Push Generated Files
                        |--> Pull Remote Changes
                        |--> Push Generated Files
                        |--> Post Setup Python
                        |--> Post Checkout Repository
                        |--> Complete Job
                  |--> deploy-to-spaces
                        |--> Set up job
                        |--> Checkout Repository
                        |--> SET UP PYTHON
                        |--> INSTALL DEPENDENCIES
                        |--> Set up Docker Buildx
                        |--> Debug Authentication
                        |--> Login to GitHub Container Registry
                        |--> Build and Push Docker image to GitHub Container Registry
                        |--> Deploy to Hugging Face Spaces
                        |--> Check Deployment Status
                        |--> Post Set up Docker Buildx
                        |--> Post Set up Python
                        |--> Post Checkout Repository
                        |--> Complete Job
In [37]:
%%writefile .github/workflows/pipeline.yml
name: Visit With Us Tourism Prediction Pipeline

on:
  push:
    branches:
      - main # Automatically triggers on push to the main branch
    paths:
      - 'Master/Data/tourism.csv'
      - 'Master/DataRegistration.py'
      - 'Master/DataPrepration.py'
      - 'Master/BuildingModels.py'
      - 'Master/main.py'
      - 'Master/HostingInHuggingFace.py'
      - '.github/workflows/pipeline.yml'
      - 'Master/Deployment/**'
  workflow_dispatch:

jobs:
  register-dataset:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Setup python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install huggingface_hub python-dotenv

      - name: List Directory Contents(Debug)
        run: |
          ls -la Master/Data/ || echo "Master/Data/ directory not found"
          ls -la . || echo "Root Directory contents"

      - name: Copy tourism.csv(if using a local file)
        run: |
          mkdir -p Master/Data
          if [ -f tourism.csv ]; then
            cp tourism.csv Master/Data/
            echo "Copied tourism.csv from root to Master/Data/"
          else
            echo "tourism.csv not found in root attemtpting to download from hugging face"

            python -c "from huggingface_hub import hf_hub_download;hf_hub_download(repo_id='jpkarthikeyan/Tourism-visit-with-us-dataset',filename='tourism.csv',local_dir='Master/Data/')"
          fi

      - name: Run Data Registration
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          cd Master
          python main.py --job register
        continue-on-error: false

      - name: Check Pipeline status
        if: failure()
        run: |
          echo "Data Registration pipeline failed. please check logs"
          exit 1

      - name: Verify Upload
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          echo "Verifying Upload on Hugging Face"
          python -c "import os;from huggingface_hub import HfApi;api= HfApi(token=os.getenv('HF_TOKEN'));print(api.repo_info(repo_id='jpkarthikeyan/Tourism-visit-with-us-dataset',repo_type='dataset'))"

  data-prepration:
    runs-on: ubuntu-latest
    needs: register-dataset
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pandas numpy huggingface_hub python-dotenv datasets scikit-learn

      - name: Copying tourism.csv
        run: |
          mkdir -p Master/Data
          cp tourism.csv Master/Data || echo "tourism.csv not found in root"

      - name: Run DataPrepration.py
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          cd Master
          python main.py --job prepare
        continue-on-error: false

      - name: Check Pipeline Status
        if: failure()
        run: |
          echo "Data Prepration pipeline failed. please check the log"
          exit 1
      - name: Verify Upload
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          echo "Verifying Upload on Hugging Face"
          python -c "import os; from huggingface_hub import HfApi; token = os.getenv('HF_TOKEN');print(HfApi(token=token).repo_info(repo_id='jpkarthikeyan/Tourism-visit-with-us-dataset', repo_type='dataset'))"

  model-building:
    runs-on: ubuntu-latest
    needs: data-prepration
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install huggingface_hub python-dotenv pandas numpy scikit-learn joblib xgboost seaborn matplotlib datasets mlflow

      - name: Create Model Dump Directory
        run: |
          mkdir -p Master/Model_Dump_JOBLIB
          mkdir -p Master/mlruns
      - name: Set Permission for MLFlow and Model Directories
        run: |
          mkdir -p Master/mlruns && chmod -R 777 Master/mlruns
          mkdir -p Master/Model_Dump_JOBLIB && chmod -R 777 Master/Model_Dump_JOBLIB
      - name: Debug Directory Contents
        run: |
          ls -la Master/
          ls -la Master/Model_Dump_JOBLIB/ || echo "Model_Dump_JOBLIB is empty"
          ls -la Master/mlruns/ || echo "mlruns is empty"

      - name: Run Model Building
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          MLFLOW_TRACKING_URI: file://${{ github.workspace }}/Master/mlruns
        run: |
          cd Master
          python main.py --job modelbuilding
        continue-on-error: false

      - name: Check pipeline status
        if: failure()
        run: |
          echo "Exception in Build Models. please check logs"
          exit 1

      - name: Verify Execution
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          echo "Verifying the execution"
          python -c "import os; from huggingface_hub import HfApi;token=os.getenv('HF_TOKEN');print(HfApi(token=token).repo_info(repo_id='jpkarthikeyan/Tourism_Prediction_Model',repo_type='model')) "

      - name: List Generated Files
        run: |
          ls -l Master/Model_Dump_JOBLIB/



      - name: Commit and Push Generated Files
        run: |
          git config --global user.name 'github-actions[bot]'
          git config --global user.email 'github-actions[bot]@users.noreply.github.com'
          git add Master/Model_Dump_JOBLIB/*
          git commit -m "Adding genearated model files and confusion matrix plots" || echo "No changes to commit"

          git pull origin main --rebase || {
             echo "Merge Conflict detectd. Aborting rebase and skipping"
             git rebase --abort
             exit 0
             }

          git push origin main
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Handle Rebase Failure
        if: failure()
        run: |
          echo "Rebase failed. Cleaning up"
          git rebase --abort || true
          exit 0


  deploy-to-spaces:
    runs-on: ubuntu-latest
    permissions:
      packages: write
      contents: read
      actions: read
    needs: model-building
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: SET UP PYTHON
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: INSTALL DEPENDENCIES
        run: |
          python -m pip install --upgrade pip
          pip install huggingface_hub python-dotenv

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Debug Authentication
        run: |
          echo "Actor: $GITHUB_ACTOR"
          echo "PAT_TOKEN is set: ${PAT_TOKEN:+[SET_${#PAT_TOKEN}_chars]}"
          if [ -z "$PAT_TOKEN" ]; then
            echo "PAT_TOKEN is empty";
          else
            echo "PAT_TOKEN length: ${#PAT_TOKEN}";
          fi

      - name: Login to GITHUB CONTAINER REGISTRY
        env:
          PAT_TOKEN: ${{ secrets.PAT_TOKEN }}
        run: |
          echo "Login to GITHUB Container Reistry"
          echo $PAT_TOKEN | docker login -u ${GITHUB_ACTOR} --password-stdin ghcr.io
          echo "Docker Login succss"

      - name: Build and Push Docker image to GITHUB Container REGISTRY
        env:
          PAT_TOKEN: ${{ secrets.PAT_TOKEN }}
        run: |
          cd Master/Deployment
          docker build -t jpkarthik/tourism-prediction-app:latest .
          docker tag jpkarthik/tourism-prediction-app:latest ghcr.io/jpkarthik/tourism-prediction-app:latest
          docker push ghcr.io/jpkarthik/tourism-prediction-app:latest

      - name: Deploy to Hugging Face Spaces
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          cd Master
          python main.py --job deploy
          echo "Deployment To HF Space"


      - name: Check Deployment Status
        if: failure()
        run: |
          echo "Deployment to Huggingface space failed please check logs"
          exit 1
Overwriting .github/workflows/pipeline.yml
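Before pushing, it can be worth confirming the workflow file parses as YAML and declares the four jobs in dependency order. A small hedged sanity check; it is shown against an inline skeleton so it is self-contained, but in practice you would read `.github/workflows/pipeline.yml` instead.

```python
import yaml  # PyYAML

# Minimal skeleton mirroring the job names and `needs` chain of pipeline.yml
SNIPPET = """\
name: Visit With Us Tourism Prediction Pipeline
jobs:
  register-dataset: {runs-on: ubuntu-latest}
  data-prepration: {runs-on: ubuntu-latest, needs: register-dataset}
  model-building: {runs-on: ubuntu-latest, needs: data-prepration}
  deploy-to-spaces: {runs-on: ubuntu-latest, needs: model-building}
"""

wf = yaml.safe_load(SNIPPET)
jobs = list(wf["jobs"])  # insertion order is preserved by safe_load
```

A parse failure here would surface YAML indentation mistakes before GitHub Actions rejects the push.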

ngrok¶

In [ ]:
os.getcwd()
Out[ ]:
'/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master'
In [19]:
import os
import subprocess
from google.colab import userdata
from pyngrok import ngrok
import mlflow

ngrok_key = userdata.get('ngrok_key')  # keep the token out of cell output
ngrok.set_auth_token(ngrok_key)
mlrun_path = os.path.join(base_path,'mlruns')
os.environ['MLFLOW_TRACKING_URI'] = f'file://{mlrun_path}'
mlflow.set_tracking_uri(f"file://{mlrun_path}")
print(mlflow.get_tracking_uri())

mlflow_process = subprocess.Popen(
    ["mlflow", "ui", "--host", "0.0.0.0","--port", "5000"],
    stdout = subprocess.PIPE,
    stderr = subprocess.PIPE,
    preexec_fn = os.setsid

)

import time
time.sleep(5)
try:
  import requests
  response = requests.get("http://localhost:5000")
  if response.status_code ==200:
    print("MLFlow UI is running")
  else:
    print(f"MLFlow is not running: {response.status_code}")
except Exception as ex:
  print(f"MLFlow is not running: {ex}")
  stdout,stderr = mlflow_process.communicate()
  print(f"stdout: {stdout.decode()}")
  print(f"stderr: {stderr.decode()}")

public_url = ngrok.connect(5000)
print(f"Mlflow UI running at: {public_url}")
file:///content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns
MLFlow UI is running
Mlflow UI running at: NgrokTunnel: "https://ed778018fffb.ngrok-free.app" -> "http://localhost:5000"
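The cell above starts the MLflow UI with `preexec_fn=os.setsid` but never stops it; since the server runs in its own session, signalling the whole process group is the clean way to shut it down. A hedged sketch using a stand-in process (in the notebook, substitute `mlflow_process` and also call `ngrok.kill()` to close the tunnel):

```python
import os
import signal
import subprocess

# Stand-in for mlflow_process; launched in its own session/process group
proc = subprocess.Popen(["sleep", "60"], preexec_fn=os.setsid)

# Signal the entire group so the server and any children exit together
os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
proc.wait()
# In the notebook, follow with ngrok.kill() to close all open tunnels
```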

GIT STEPS TO PUSH THE CODE FROM LOCAL TO REMOTE¶

Step 01

from google.colab import drive

drive.mount('/content/drive')

Step 02

%cd /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/

Step 03

apt-get install git

Step 04

git init

Step 05

git config --global user.email "jpkaimlgl@gmail.com"

git config --global user.name "jpkarthik"

Step 06

git remote add origin https://github.com/jpkarthik/VisitWithUs-ColabNotebook/

Step 07

git add Master/**

Step 08

git add .github/workflows/pipeline.yml

Step 09

git commit -m "Comments"

Step 10

git push origin main
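The steps above can be condensed into a single script. This version runs in a throwaway directory so it is safe to execute as-is; the remote and push lines are commented out because they require the real repository and credentials.

```shell
set -e
repo=$(mktemp -d)                 # throwaway stand-in for the project folder
cd "$repo"
git init -q
git config user.email "jpkaimlgl@gmail.com"
git config user.name  "jpkarthik"
mkdir -p Master .github/workflows
echo demo > Master/placeholder.txt
echo demo > .github/workflows/pipeline.yml
git add Master .github/workflows/pipeline.yml
git commit -qm "Comments"
# git remote add origin https://github.com/jpkarthik/VisitWithUs-ColabNotebook/
# git push origin main
git log --oneline
```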

MLFLOW SCREENSHOT¶

MLFLOW Home page¶

image.png

Experiment - Tourism-Prediction-Experiment¶

Models

image.png

Best Models¶

image.png

Experiments--> Tourism-Prediction-Experiment --> Run¶

image.png

Best Model Metrics page¶

image.png

HuggingFace Screenshot¶

Dataset: https://huggingface.co/datasets/jpkarthikeyan/Tourism-visit-with-us-dataset

Models: https://huggingface.co/jpkarthikeyan/Tourism_Prediction_Model

Spaces: https://huggingface.co/spaces/jpkarthikeyan/Tourism-Prediction-Model-Space

HuggingFace Dataset¶

image.png

HuggingFace dataset FileVersion¶

image.png

HuggingFace Models Page¶

image.png

HuggingFace Space¶

Files

image.png

App Page -- > Unlikely to purchase¶

image.png

App Page --> Likely to purchase¶

image.png

GITHUB Screenshots¶

https://github.com/jpkarthik/VisitWithUs-ColabNotebook

Root Folder Path

image.png

GITHUB FOLDER STRUCTURE¶

image.png

GITHUB PIPELINE EXECUTION¶

image.png

GITHUB ACTIONS TAB¶

image.png

BUSINESS RECOMMENDATION AND CONCLUSION¶

Models were built with the Decision Tree, Random Forest, and Gradient Boosting classifiers.

Across the training runs, the Gradient Boosting Classifier achieved the highest metrics:

  • Accuracy 91.8%
  • Precision 76.6%
  • Recall 82.4%
  • F1-Score 79.4%

Business Recommendation: Based on the Gradient Boosting Classifier's performance and the tourism prediction context, here are actionable recommendations for "Visit with Us" to optimize customer targeting and increase package sales:

  1. Prioritize high-potential customers:
  • Use the Gradient Boosting Classifier to identify customers with a high likelihood of purchasing tourism packages. The model's high recall (82.4%) ensures that most potential buyers are captured, reducing missed opportunities.
  • Focus marketing efforts (e.g., personalized emails, discounts, or tailored promotions) on customers the model scores above the tuned decision threshold (0.2487, as identified in the evaluation logs above).

  2. Optimize marketing resources:
  • The model's precision of 76.6% indicates that roughly three out of four predicted buyers are correct, minimizing resources wasted on unlikely customers. Allocate budget to high-probability leads to improve ROI.
  • Use feature importances from the Gradient Boosting model to understand the key drivers of purchase decisions (e.g., Age, MonthlyIncome, NumberOfTrips), and tailor campaigns accordingly, for instance pitching luxury packages to high-income customers.

  3. Enhance customer engagement:
  • For customers with lower predicted probabilities, develop nurturing campaigns to convert them over time.
  • Leverage the model's insights to segment customers by demographics or behaviour (e.g., TypeofContact, Occupation, Designation) for personalized engagement strategies.

  4. Monitor and refine model performance:
  • Continuously track the model's performance in production using metrics such as F1-score and accuracy, and retrain with new customer data to maintain its predictive power.

  5. Streamline operations with automation:
  • Integrate the deployed Streamlit app into the company's CRM system to provide real-time predictions for sales teams, enabling quick decision-making during customer interactions.
  • Automate follow-up processes for high-probability leads using the app's output, reducing manual effort and improving efficiency.
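The targeting recommendations above amount to bucketing customers by predicted purchase probability. A hedged sketch of that segmentation: the 0.2487 cutoff is the tuned Gradient Boosting threshold from the evaluation logs, while the nurture band, tier names, and toy scores are illustrative assumptions.

```python
import numpy as np
import pandas as pd

THRESHOLD = 0.2487  # tuned Gradient Boosting cutoff from the evaluation step

def segment_customers(proba, threshold=THRESHOLD):
    """Bucket customers into campaign tiers from P(purchase)."""
    return np.select(
        [proba >= threshold, proba >= threshold / 2],  # checked in order
        ["priority_outreach", "nurture_campaign"],
        default="low_touch",
    )

# Toy positive-class probabilities, as produced by predict_proba(...)[:, 1]
scores = pd.Series([0.05, 0.15, 0.30, 0.80], name="purchase_proba")
tiers = segment_customers(scores.to_numpy())
```

Sales teams could then receive the `priority_outreach` list for immediate contact, while `nurture_campaign` customers enter longer-term email sequences.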
In [9]:
from google.colab import drive
drive.mount('/content/drive/')
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/'
Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
In [12]:
!ls
BuildingModels.py	 main.py
Data			 Model_Dump_JOBLIB
DataPrepration.py	 __pycache__
DataRegistration.py	 README.md
Deployment		 Visit-With-Us-Tourism-Prediction_v1_1.ipynb
HostingInHuggingFace.py
In [10]:
!pip install nbconvert
Requirement already satisfied: nbconvert in /usr/local/lib/python3.12/dist-packages (7.16.6)
Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (4.13.4)
Requirement already satisfied: bleach!=5.0.0 in /usr/local/lib/python3.12/dist-packages (from bleach[css]!=5.0.0->nbconvert) (6.2.0)
Requirement already satisfied: defusedxml in /usr/local/lib/python3.12/dist-packages (from nbconvert) (0.7.1)
Requirement already satisfied: jinja2>=3.0 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (3.1.6)
Requirement already satisfied: jupyter-core>=4.7 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (5.8.1)
Requirement already satisfied: jupyterlab-pygments in /usr/local/lib/python3.12/dist-packages (from nbconvert) (0.3.0)
Requirement already satisfied: markupsafe>=2.0 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (3.0.2)
Requirement already satisfied: mistune<4,>=2.0.3 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (3.1.3)
Requirement already satisfied: nbclient>=0.5.0 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (0.10.2)
Requirement already satisfied: nbformat>=5.7 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (5.10.4)
Requirement already satisfied: packaging in /usr/local/lib/python3.12/dist-packages (from nbconvert) (25.0)
Requirement already satisfied: pandocfilters>=1.4.1 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (1.5.1)
Requirement already satisfied: pygments>=2.4.1 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (2.19.2)
Requirement already satisfied: traitlets>=5.1 in /usr/local/lib/python3.12/dist-packages (from nbconvert) (5.7.1)
Requirement already satisfied: webencodings in /usr/local/lib/python3.12/dist-packages (from bleach!=5.0.0->bleach[css]!=5.0.0->nbconvert) (0.5.1)
Requirement already satisfied: tinycss2<1.5,>=1.1.0 in /usr/local/lib/python3.12/dist-packages (from bleach[css]!=5.0.0->nbconvert) (1.4.0)
Requirement already satisfied: platformdirs>=2.5 in /usr/local/lib/python3.12/dist-packages (from jupyter-core>=4.7->nbconvert) (4.3.8)
Requirement already satisfied: jupyter-client>=6.1.12 in /usr/local/lib/python3.12/dist-packages (from nbclient>=0.5.0->nbconvert) (6.1.12)
Requirement already satisfied: fastjsonschema>=2.15 in /usr/local/lib/python3.12/dist-packages (from nbformat>=5.7->nbconvert) (2.21.2)
Requirement already satisfied: jsonschema>=2.6 in /usr/local/lib/python3.12/dist-packages (from nbformat>=5.7->nbconvert) (4.25.1)
Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.12/dist-packages (from beautifulsoup4->nbconvert) (2.7)
Requirement already satisfied: typing-extensions>=4.0.0 in /usr/local/lib/python3.12/dist-packages (from beautifulsoup4->nbconvert) (4.14.1)
Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.12/dist-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (25.3.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.12/dist-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (2025.4.1)
Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.12/dist-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (0.36.2)
Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.12/dist-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (0.27.0)
Requirement already satisfied: pyzmq>=13 in /usr/local/lib/python3.12/dist-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (26.2.1)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.12/dist-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (2.9.0.post0)
Requirement already satisfied: tornado>=4.1 in /usr/local/lib/python3.12/dist-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (6.4.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.12/dist-packages (from python-dateutil>=2.1->jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (1.17.0)
In [14]:
!jupyter nbconvert --to html Visit-With-Us-Tourism-Prediction_v1_1.ipynb
[NbConvertApp] Converting notebook Visit-With-Us-Tourism-Prediction_v1_1.ipynb to html
[NbConvertApp] Writing 4837072 bytes to Visit-With-Us-Tourism-Prediction_v1_1.html